Wind speed prediction method based on a feedforward complex-valued neural network

ABSTRACT

The invention discloses a wind speed prediction method based on a feedforward complex-valued neural network (FCVNN), including: acquiring a training set and a prediction set for wind speed prediction, constructing an FCVNN, and initializing its parameter vector; introducing a Group Lasso regularization term into the target function for training, converting the training into solving a constrained optimization problem, training the FCVNN by using the training set and a specialized complex-valued projected quasi-Newton algorithm, and stopping the training when a preset condition is met, to obtain the trained FCVNN; and inputting the prediction set into the trained FCVNN to obtain a wind speed prediction result. Because the Group Lasso regularization term is introduced and the FCVNN is trained by using the specialized complex-valued projected quasi-Newton algorithm, both the network structure and the parameters are optimized, thereby obtaining a compact network structure and high generalization performance and improving the accuracy of wind speed prediction.

FIELD OF THE INVENTION

The present invention relates to the field of wind speed prediction technologies, and more particularly to a wind speed prediction method based on a feedforward complex-valued neural network.

DESCRIPTION OF THE RELATED ART

Compared with some traditional non-renewable energy sources such as oil, wind energy has attracted more and more attention as a green and environmentally friendly renewable energy source. The development of wind energy has become a current trend. However, due to the random and intermittent nature of wind speed, the instability of wind speed may pose a threat to the safety and stability of a power grid system. Therefore, accurate wind speed prediction plays a crucial role in the development of wind energy.

Currently, there are mainly two kinds of wind speed prediction methods. One is a physical model prediction method based on weather forecast data, and the other is a wind speed prediction method based on historical data. However, due to the lack of numerical meteorological information, the physical model prediction method based on weather forecast data is relatively seldom used. Therefore, dynamic changes in wind speed are mostly predicted by using historical data, and such prediction is widely used in wind power stations. Common methods include artificial neural networks, support vector machines, and Kalman filtering.

As a simple and effective modeling method, artificial neural networks have excellent nonlinear mapping and approximation capabilities, and have been widely used in wind speed prediction and related applications in recent years. However, when artificial neural network models are used for wind speed prediction, it is often difficult to achieve the expected performance due to an inappropriate design of the network structure. Therefore, the selection of an appropriate network structure is an urgent problem to be resolved for artificial neural network methods. The simplest method is to determine a relatively appropriate structure through manual trial and error, but this method is time-consuming and laborious. In addition, the gradient descent method is widely used in the training of feedforward neural networks to obtain appropriate network weights and biases, but it is prone to problems such as falling into local minima and slow convergence. Therefore, how to design an appropriate training method that finds both an appropriate network structure and appropriate parameters also deserves further research.

SUMMARY OF THE INVENTION

For this, a technical problem to be resolved by the present invention is to overcome disadvantages in the prior art, and provide a wind speed prediction method based on a feedforward complex-valued neural network that can implement optimization of both a network structure and parameters.

To resolve the foregoing technical problems, the present invention provides a wind speed prediction method based on a feedforward complex-valued neural network, including the following steps:

step 1: acquiring data used for wind speed prediction, arranging the data as a data set, and dividing the data set into a training set and a prediction set;

step 2: constructing a feedforward complex-valued neural network, and initializing a parameter vector ψ in the feedforward complex-valued neural network, where the parameter vector ψ is formed by adjustable parameters including connection weights between neurons, biases of neurons, and gain coefficients of activation functions;

step 3: introducing a Group Lasso regularization term to construct a target function during the training of the feedforward complex-valued neural network, and converting the training of the feedforward complex-valued neural network into solving a constrained optimization problem; training the feedforward complex-valued neural network by using the training set and a specialized complex-valued projected quasi-Newton algorithm, and stopping the training when a preset iteration termination condition is met; and

step 4: obtaining the trained feedforward complex-valued neural network, and inputting the prediction set into the trained feedforward complex-valued neural network to obtain a wind speed prediction result.

Preferably, in step 2, the constructed feedforward complex-valued neural network includes P input neurons, N hidden neurons, and Q output neurons, the parameter vector ψ in the feedforward complex-valued neural network is a column vector, and all the adjustable parameters are arranged in an order to obtain the parameter vector ψ:

$\Psi = {\begin{bmatrix}{\left( w_{1}^{R} \right)^{T},\left( w_{1}^{I} \right)^{T},\ldots,\left( w_{P}^{R} \right)^{T},\left( w_{P}^{I} \right)^{T},\left( b_{1}^{R} \right)^{T},\left( b_{1}^{I} \right)^{T},\left( \sigma_{1}^{R} \right)^{T},\left( \sigma_{1}^{I} \right)^{T},} \\{\left( v_{1}^{R} \right)^{T},\left( v_{1}^{I} \right)^{T},\ldots,\left( v_{N}^{R} \right)^{T},\left( v_{N}^{I} \right)^{T},\left( b_{2}^{R} \right)^{T},\left( b_{2}^{I} \right)^{T},\left( \sigma_{2}^{R} \right)^{T},\left( \sigma_{2}^{I} \right)^{T}}\end{bmatrix}^{T} = \begin{bmatrix}{\left( {{ri}\left( w_{1} \right)} \right)^{T},\ldots,\left( {{ri}\left( w_{P} \right)} \right)^{T},\left( {{ri}\left( b_{1} \right)} \right)^{T},\left( \sigma_{1}^{R} \right)^{T},\left( \sigma_{1}^{I} \right)^{T},} \\{\left( {{ri}\left( v_{1} \right)} \right)^{T},\ldots,\left( {{ri}\left( v_{N} \right)} \right)^{T},\left( {{ri}\left( b_{2} \right)} \right)^{T},\left( \sigma_{2}^{R} \right)^{T},\left( \sigma_{2}^{I} \right)^{T}}\end{bmatrix}^{T}}$

where w_(p) represents a complex vector formed by connection weights between a p^(th) input neuron and the hidden neurons, b₁ represents a complex vector formed by biases of all the hidden neurons, σ₁ represents a complex vector formed by gain coefficients of activation functions of the hidden neurons, v_(n) represents a complex vector formed by connection weights between an n^(th) hidden neuron and the output neurons, b₂ is a complex vector formed by biases of all the output neurons, σ₂ represents a complex vector formed by gain coefficients of activation functions of the output neurons, and the superscript T represents transpose; the superscript R represents a vector formed by real parts of the corresponding complex vector, the superscript I represents a vector formed by imaginary parts of the corresponding complex vector, and

${{{ri}( \bullet )} = \begin{pmatrix}( \bullet )^{R} \\( \bullet )^{I}\end{pmatrix}};$

and

a hidden output vector of the feedforward complex-valued neural network is h_(j)=f_(C)(Wz_(j)+b₁), and an output vector of the output layer is o_(j)=f_(C)(Vh_(j)+b₂), where f_(C)(⋅) represents an activation function, W=[w₁, w₂, . . . , w_(P)] is a weight matrix between the input and hidden neurons, z_(j) is a j^(th) input sample of the feedforward complex-valued neural network, and V=[v₁, v₂, . . . , v_(N)] is a weight matrix between the hidden and output neurons.

Preferably, a specific process of step 3 includes:

step 3.1: introducing a Group Lasso regularization term R_(GL) into a traditional mean square error function E′ to obtain a target function E during the training of the feedforward complex-valued neural network;

step 3.2: introducing a group of artificial variables ρ_(a) to convert an unconstrained optimization problem

${\min\limits_{\psi}E} = {E^{\prime} + {\lambda{\sum_{a = 1}^{A}{\sqrt{\left| \psi_{a} \right|}\left\| \psi_{a} \right\|}}}}$

into a constrained optimization problem

${{\min\limits_{\bar{\psi}}E} = {E^{\prime} + {\lambda{\sum_{a = 1}^{A}{\sqrt{\left| \psi_{a} \right|}\rho_{a}}}}}},$ s.t. ∥ψ_(a)∥ ≤ ρ_(a), a = 1, 2, …, A;

and

defining ρ=[ρ₁, ρ₂, . . . , ρ_(A)]^(T) as a real vector formed by the introduced variables ρ_(a), where a parameter vector that needs to be optimized during the training process in this case is ψ̄=[ψ^(T), ρ^(T)]^(T);

step 3.3: calculating an approximate Hessian matrix H^((m)) by using matrices S={s^((i))}_(i=m-τ) ^(i=m-1) and R={r^((i))}_(i=m-τ) ^(i=m-1), and obtaining an approximate quadratic optimization problem with a constraint condition, where τ is a constant, representing that a parameter variation s^((i)) and a gradient variation r^((i)) of the latest τ iterations are kept; and s^((i))=ψ̄^((i+1))−ψ̄^((i)), r^((i))=∇E(ψ̄^((i+1)))−∇E(ψ̄^((i))), ψ̄^((i+1)) represents a parameter vector value of ψ̄ at the (i+1)^(th) iteration, ψ̄^((i)) represents a parameter vector value of ψ̄ at the i^(th) iteration, ∇E(ψ̄^((i))) represents a gradient of the target function E at ψ̄^((i)), ∇E(ψ̄^((i+1))) represents a gradient of the target function E at ψ̄^((i+1)), S represents a matrix formed by parameter variations s^((i)) from the (m−τ)^(th) to (m−1)^(th) iterations, R represents a matrix formed by gradient variations r^((i)) from the (m−τ)^(th) to (m−1)^(th) iterations, and m represents an iteration number;

step 3.4: solving the approximate quadratic optimization problem with a constraint condition by using a spectral projected gradient algorithm, to obtain a solution ψ̄* of the approximate quadratic optimization problem with a constraint condition;

step 3.5: calculating a feasible descending direction d^((m)) of the original constrained optimization problem

${{\min\limits_{\bar{\psi}}E} = {E^{\prime} + {\lambda{\sum_{a = 1}^{A}{\sqrt{\left| \psi_{a} \right|}\rho_{a}}}}}},$ s.t. ∥ψ_(a)∥ ≤ ρ_(a), a = 1, 2, …, A

at the m^(th) iteration by using the solution ψ̄* of the approximate quadratic optimization problem with a constraint condition, and using Armijo line search to determine an appropriate learning step size η^((m));

step 3.6: updating ψ̄ by using d^((m)) and η^((m)), and updating the matrices S and R; and

step 3.7: repeating step 3.3 to step 3.6, and stopping the training of the feedforward complex-valued neural network when the preset iteration termination condition is met.

Preferably, the Group Lasso regularization term in step 3.1 is $R_{GL} = \lambda\sum_{a = 1}^{A}\sqrt{\left| \psi_{a} \right|}\left\| \psi_{a} \right\|$, and the target function E for the training of the feedforward complex-valued neural network is:

${{\min\limits_{\psi}E} = {E^{\prime} + R_{GL}} = {{\frac{1}{2J}{\sum_{j = 1}^{J}{\left( {o_{j} - y_{j}} \right)^{H}\left( {o_{j} - y_{j}} \right)}}} + {\lambda{\sum_{a = 1}^{A}{\sqrt{\left| \psi_{a} \right|}\left\| \psi_{a} \right\|}}}}},$ where $E^{\prime} = {\frac{1}{2J}{\sum_{j = 1}^{J}{\left( {o_{j} - y_{j}} \right)^{H}\left( {o_{j} - y_{j}} \right)}}}$

is the traditional mean square error function, J is a total number of training samples, o_(j) represents an actual output of a j^(th) training sample, y_(j) represents a desired output of the j^(th) training sample, and the superscript H represents the conjugate transpose; λ is a regularization coefficient, a=1,2, . . . , A, A=P+N+2 represents a total number of neurons that may be penalized, that is, P input neurons, N hidden neurons, and 2 bias nodes; and |⋅| represents the dimension of a vector, ∥⋅∥ is the Euclidean norm, and ψ_(a) represents a vector formed by connection weights between an a^(th) neuron and all neurons in a next layer in the feedforward complex-valued neural network.

Preferably, a specific process of the step 3.3 includes:

step 3.3.1: using a calculation formula of the approximate Hessian matrix H^((m)):

H ^((m))=σ^((m)) I−NM ⁻¹ N ^(T),

where

${\sigma^{(m)} = \frac{\left( r^{({m - 1})} \right)^{T}r^{({m - 1})}}{\left( r^{({m - 1})} \right)^{T}s^{({m - 1})}}},$

r^((m-1))=∇E(ψ̄^((m)))−∇E(ψ̄^((m-1))), and s^((m-1))=ψ̄^((m))−ψ̄^((m-1)); N=[σ^((m))S R],

${M = \begin{bmatrix}{\sigma^{(m)}S^{T}S} & L \\ L^{T} & {- D}\end{bmatrix}},$

L is a matrix formed by elements

$L_{ij} = \left\{ {\begin{matrix}{\left( s^{({m - \tau - 1 + i})} \right)^{T}\left( r^{({m - \tau - 1 + j})} \right)} & {i > j} \\0 & {i \leq j}\end{matrix},} \right.$

I is an identity matrix, and D=diag[(s^((m-τ)))^(T)(r^((m-τ))), . . . ,(s^((m-1)))^(T)(r^((m-1)))] is a diagonal matrix; and

step 3.3.2: obtaining the approximate quadratic optimization problem with a constraint condition at the m^(th) iteration by using the approximate Hessian matrix H^((m)):

${\min\limits_{\bar{\psi}}Q} = {{E\left( {\bar{\psi}}^{(m)} \right)} + {\left( {\bar{\psi} - {\bar{\psi}}^{(m)}} \right)^{T}{\nabla{E\left( {\bar{\psi}}^{(m)} \right)}}} + {\frac{1}{2}\left( {\bar{\psi} - {\bar{\psi}}^{(m)}} \right)^{T}{H^{(m)}\left( {\bar{\psi} - {\bar{\psi}}^{(m)}} \right)}}},$ s.t. ∥ψ_(a)∥ ≤ ρ_(a), a = 1, 2, …, A.

Preferably, the step 3.4 includes:

step 3.4.1: calculating η_(bb) ^((t)) by using a formula

${\eta_{bb}^{(t)} = \frac{\left( {qs}^{({t - 1})} \right)^{T}{qs}^{({t - 1})}}{\left( {qr}^{({t - 1})} \right)^{T}{qs}^{({t - 1})}}},$

and calculating ψ̄₊^((t)) according to a formula ψ̄₊^((t))=ψ̄^((t))−η_(bb)^((t))∇Q(ψ̄^((t))), where qs^((t-1))=ψ̄^((t))−ψ̄^((t-1)), qr^((t-1))=∇Q(ψ̄^((t)))−∇Q(ψ̄^((t-1))), ∇Q(ψ̄^((t))) represents a gradient of the target function of the approximate quadratic optimization problem with a constraint condition at ψ̄^((t)), and t represents the iteration number for optimizing the approximate quadratic optimization problem with a constraint condition by using the spectral projected gradient algorithm;

step 3.4.2: correcting parameters of each group of neurons in ψ̄₊^((t)) by using a projection operator

${P_{\Omega}\left( {\psi_{a},\rho_{a}} \right)} = \left\{ \begin{matrix}\left( {\psi_{a},\rho_{a}} \right) & {\left\| \psi_{a} \right\| \leq \rho_{a}} \\ \left( {\frac{\psi_{a}}{\left\| \psi_{a} \right\|}\frac{\left\| \psi_{a} \right\| + \rho_{a}}{2},\frac{\left\| \psi_{a} \right\| + \rho_{a}}{2}} \right) & {{\left\| \psi_{a} \right\| > \rho_{a}},{\left\| \psi_{a} \right\| + \rho_{a} > 0}} \\ \left( {0,0} \right) & {{\left\| \psi_{a} \right\| > \rho_{a}},{\left\| \psi_{a} \right\| + \rho_{a} \leq 0}}\end{matrix} \right.$

to make the parameters meet a constraint condition ∥ψ_(a)∥≤ρ_(a), a=1,2, . . . ,A, and calculating ψ̄_(p)^((t));

step 3.4.3: obtaining, according to a formula d_(q)^((t))=ψ̄_(p)^((t))−ψ̄^((t)), a search direction d_(q)^((t)) of solving the approximate quadratic optimization problem with a constraint condition at the t^(th) iteration;

step 3.4.4: obtaining a learning step size η_(q)^((t)) of the search direction d_(q)^((t)) by using a nonmonotone line search:

${{Q\left( {{\overset{\_}{\psi}}^{(t)} + {\eta_{q}^{(t)}d_{q}^{(t)}}} \right)} \leq {{\max\limits_{{t - k} \leq i \leq t}{Q\left( {\overset{\_}{\psi}}^{(i)} \right)}} + {l_{3}\eta_{q}^{(t)}{\nabla{Q\left( {\overset{\_}{\psi}}^{(t)} \right)}^{T}}d_{q}^{(t)}}}},$l₃ ∈ (0, 1);

and

step 3.4.5: updating the parameters according to a formula ψ̄^((t+1))=ψ̄^((t))+η_(q)^((t))d_(q)^((t)), and determining whether the quantity of times of evaluation of the target function of the approximate quadratic optimization problem with a constraint condition is greater than a preset constant T_(e), wherein if not, the process returns to step 3.4.1, or if yes, the algorithm stops, to obtain a solution ψ̄* of the approximate quadratic optimization problem with a constraint condition.

Preferably, a calculation method of the feasible descending direction d^((m)) in step 3.5 includes:

at the m^(th) iteration of the specialized complex-valued projected quasi-Newton algorithm, first calculating a solution ψ̄* of the quadratic optimization problem with a constraint condition by using the spectral projected gradient algorithm, and then obtaining the feasible descending direction d^((m)) of the original constrained optimization problem

${{\min\limits_{\bar{\psi}}E} = {E^{\prime} + {\lambda{\sum_{a = 1}^{A}{\sqrt{\left| \psi_{a} \right|}\rho_{a}}}}}},$ s.t. ∥ψ_(a)∥ ≤ ρ_(a), a = 1, 2, …, A

at the m^(th) iteration according to a formula d^((m))=ψ̄*−ψ̄^((m)).

Preferably, using Armijo line search to determine an appropriate learning step size η^((m)) in step 3.5 specifically includes:

determining an appropriate learning step size η^((m)) by using Armijo line search at the m^(th) iteration of the specialized complex-valued projected quasi-Newton algorithm:

E(ψ̄^((m))+η^((m))d^((m))) ≤ E(ψ̄^((m)))+l₄η^((m))∇E(ψ̄^((m)))^(T)d^((m)),

where l₄ ∈ (0, 1), d^((m)) represents a feasible descending direction of the original constrained optimization problem

${{\min\limits_{\bar{\psi}}E} = {E^{\prime} + {\lambda{\sum_{a = 1}^{A}{\sqrt{\left| \psi_{a} \right|}\rho_{a}}}}}},$ s.t. ∥ψ_(a)∥ ≤ ρ_(a), a = 1, 2, …, A

at the m^(th) iteration, and ∇E(ψ̄^((m))) represents a gradient of the target function E at ψ̄^((m)).

Preferably, the updating ψ̄ by using d^((m)) and η^((m)), and updating the matrices S and R in step 3.6 include: updating, according to a formula ψ̄^((m+1))=ψ̄^((m))+η^((m))d^((m)), a parameter vector ψ̄ that needs to be optimized in the feedforward complex-valued neural network; and calculating s^((m))=ψ̄^((m+1))−ψ̄^((m)) and r^((m))=∇E(ψ̄^((m+1)))−∇E(ψ̄^((m))), storing the information of s^((m)) and r^((m)) into the matrices S and R, and discarding s^((m−τ)) and r^((m−τ)) of the (m−τ)^(th) iteration from the matrices S and R, to implement the update of S and R.

Preferably, the preset iteration termination condition in step 3 is specifically: the quantity of times of evaluation of the target function during the training of the feedforward complex-valued neural network reaches a preset maximum quantity of times of evaluation, or a variation between the values of the target function in two consecutive iterations is less than a preset threshold, or a maximum variation of an adjustable parameter in ψ̄ is less than a preset threshold.

Compared with the prior art, the foregoing technical solutions of the present invention have the following advantages:

(1) In the present invention, a Group Lasso regularization term is introduced into a traditional mean square error function to obtain a target function E for the training of a feedforward complex-valued neural network, so that redundant input neurons and hidden neurons can be effectively deleted during the training process, to implement optimization of a network structure and parameter vectors.

(2) A feedforward complex-valued neural network is trained by using a specialized complex-valued projected quasi-Newton algorithm, and gain coefficients of activation functions may be optimized together as adjustable parameters, so that the adverse impact of the activation function falling in a saturated area during the training process is overcome. In addition, weights of redundant neurons in the feedforward complex-valued neural network are directly reset, where a threshold does not need to be set in advance, and redundant neurons can be directly deleted without causing any impact on a final output of a model; that is, the optimization of both a network structure and parameter vectors can be implemented.

(3) A Group Lasso regularization term is introduced and a feedforward complex-valued neural network is trained by using a specialized complex-valued projected quasi-Newton algorithm, and optimization of a network structure and parameters is implemented, so that the structure of the feedforward complex-valued neural network is compact, the generalization performance of a network model is enhanced, and the feedforward complex-valued neural network has smaller errors for wind speed prediction, thereby improving the accuracy of prediction.

BRIEF DESCRIPTION OF THE DRAWINGS

To make the content of the present invention clearer and more comprehensible, the present invention is further described in detail below according to specific embodiments of the present invention and the accompanying drawings.

FIG. 1 is a flowchart of the present invention.

FIG. 2 shows the training process of a feedforward complex-valued neural network by using a specialized complex-valued projected quasi-Newton algorithm in the present invention.

FIG. 3 is a flowchart of a training method using a specialized complex-valued projected quasi-Newton algorithm in the present invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

The present invention is further described below with reference to the accompanying drawings and specific embodiments, to enable a person skilled in the art to better understand and implement the present invention. However, the embodiments are not used to limit the present invention.

In the description of the present invention, it needs to be understood that the term “include” is intended to cover a non-exclusive inclusion. For example, a process, method, system, product or device that includes a series of steps or units not only includes those specified steps or units, but optionally further includes steps or units that are not specified, or optionally further includes other steps or units that are inherent to these processes, methods, products or devices.

Referring to the flowchart in FIG. 1, an embodiment of a wind speed prediction method based on a feedforward complex-valued neural network according to the present invention includes the following steps:

Step 1: Acquire data used for wind speed prediction, arrange the data as a data set, and divide the data set into a training set and a prediction set. The data set is z=[z¹, z², . . . , z^(P)]^(T), where P represents the dimension of an input, including six groups of parameters: an average wind speed value, an average wind direction value, a standard deviation, an atmospheric pressure, a temperature, and a humidity. The data set is acquired from historical data. The value of P in this embodiment is 6. The elements in z=[z¹, z², . . . , z^(P)]^(T) respectively correspond to the average wind speed value, the average wind direction value, the standard deviation, the atmospheric pressure, the temperature, and the humidity.
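For illustration only, the following minimal Python sketch shows how one input sample z ∈ C^P (P = 6) could be assembled from the six measured quantities. The description does not fix how the real-valued measurements are embedded into complex numbers, so placing each measurement in the real part with a zero imaginary part is purely an assumption, and the helper name make_sample is hypothetical.

    import numpy as np

    # Sketch: assemble one input sample z = [z^1, ..., z^P]^T with P = 6.
    # Assumption: each real-valued measurement is stored in the real part
    # of a complex entry; the actual embedding is not specified here.
    def make_sample(avg_speed, avg_direction, std_dev,
                    pressure, temperature, humidity):
        features = [avg_speed, avg_direction, std_dev,
                    pressure, temperature, humidity]
        return np.asarray(features, dtype=np.complex128)

    z = make_sample(6.2, 135.0, 0.8, 1013.2, 18.5, 0.62)  # hypothetical values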

Step 2: Construct a model of a feedforward complex-valued neural network, where a parameter vector in the feedforward complex-valued neural network is formed by adjustable parameters including connection weights between neurons, biases of neurons, and gain coefficients of activation functions. The model of the feedforward complex-valued neural network includes P input neurons (the quantity P of the input neurons is kept consistent with the dimension P of the data set), N hidden neurons, and Q output neurons. A parameter vector ψ in the feedforward complex-valued neural network is a column vector. All the adjustable parameters are arranged in an order to obtain the parameter vector ψ:

${\Psi = {\begin{bmatrix}{\left( w_{1}^{R} \right)^{T},\left( w_{1}^{I} \right)^{T},\ldots,\left( w_{P}^{R} \right)^{T},\left( w_{P}^{I} \right)^{T},\left( b_{1}^{R} \right)^{T},\left( b_{1}^{I} \right)^{T},\left( \sigma_{1}^{R} \right)^{T},\left( \sigma_{1}^{I} \right)^{T},} \\{\left( v_{1}^{R} \right)^{T},\left( v_{1}^{I} \right)^{T},\ldots,\left( v_{N}^{R} \right)^{T},\left( v_{N}^{I} \right)^{T},\left( b_{2}^{R} \right)^{T},\left( b_{2}^{I} \right)^{T},\left( \sigma_{2}^{R} \right)^{T},\left( \sigma_{2}^{I} \right)^{T}}\end{bmatrix}^{T} = \begin{bmatrix}{\left( {{ri}\left( w_{1} \right)} \right)^{T},\ldots,\left( {{ri}\left( w_{P} \right)} \right)^{T},\left( {{ri}\left( b_{1} \right)} \right)^{T},\left( \sigma_{1}^{R} \right)^{T},\left( \sigma_{1}^{I} \right)^{T},} \\{\left( {{ri}\left( v_{1} \right)} \right)^{T},\ldots,\left( {{ri}\left( v_{N} \right)} \right)^{T},\left( {{ri}\left( b_{2} \right)} \right)^{T},\left( \sigma_{2}^{R} \right)^{T},\left( \sigma_{2}^{I} \right)^{T}}\end{bmatrix}^{T}}},$

where w_(p) represents a complex vector formed by connection weights between a p^(th) input neuron and the hidden-layer neurons, b₁ represents a complex vector formed by biases of all the hidden-layer neurons, σ₁ represents a complex vector formed by gain coefficients of activation functions of the hidden-layer neurons, v_(n) represents a complex vector formed by connection weights between an n^(th) hidden-layer neuron and the output-layer neurons, b₂ is a complex vector formed by biases of all the output-layer neurons, σ₂ represents a complex vector formed by gain coefficients of activation functions of the output-layer neurons, and the superscript T represents transpose; the superscript R represents a vector formed by real parts of the corresponding complex vectors, the superscript I represents a vector formed by imaginary parts of the corresponding complex vectors, and

${{{ri}( \bullet )} = \begin{pmatrix}( \bullet )^{R} \\( \bullet )^{I}\end{pmatrix}};$

and a gain coefficient of an activation function may be optimized together as an adjustable parameter, which prevents the activation functions of the hidden-layer neurons from falling in a saturated area, so that the adverse impact of an activation function falling in a saturated area on the training process is overcome.
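As a minimal sketch of the ri(·) operator and of packing the complex parameters into the real column vector ψ, the following fragment follows the ordering given above; the helper names ri and pack_params are ours, and the gain coefficients are assumed to be stored as complex vectors whose real and imaginary parts are appended separately.

    import numpy as np

    # ri(.) stacks the real part of a complex vector on top of its imaginary part.
    def ri(c):
        c = np.asarray(c, dtype=np.complex128)
        return np.concatenate([c.real, c.imag])

    # Pack all adjustable parameters into the real column vector psi, following
    # the order w_1, ..., w_P, b_1, sigma_1, v_1, ..., v_N, b_2, sigma_2 above.
    def pack_params(w_list, b1, sigma1, v_list, b2, sigma2):
        parts = [ri(w) for w in w_list] + [ri(b1), sigma1.real, sigma1.imag]
        parts += [ri(v) for v in v_list] + [ri(b2), sigma2.real, sigma2.imag]
        return np.concatenate(parts)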

When the input is a j^(th) training sample, a hidden output vector of the feedforward complex-valued neural network is h_(j)=f_(C)(Wz_(j)+b₁), and an output vector of an output layer is o_(j)=f_(C)(Vh_(j)+b₂), where f_(C)(⋅) represents an activation function, W=[w₁, w₂, . . . , w_(P)] is a weight matrix between the input neurons and the hidden neurons, z_(j) is a j^(th) input sample of the feedforward complex-valued neural network, and V=[v₁, v₂, . . . , v_(N)] is a weight matrix between the hidden neurons and the output neurons.
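The forward pass can then be sketched as follows. The split activation below (tanh applied separately to the gain-scaled real and imaginary parts) is only one plausible choice of f_C, since the description does not fix the activation function or exactly how the gain coefficients enter; both are assumptions made for illustration.

    import numpy as np

    # Hypothetical split-type activation f_C with gain coefficients sigma.
    def f_C(u, sigma):
        return np.tanh(sigma.real * u.real) + 1j * np.tanh(sigma.imag * u.imag)

    # Forward pass: h_j = f_C(W z_j + b1), o_j = f_C(V h_j + b2).
    # W is N x P (columns w_p); V is Q x N (columns v_n).
    def forward(W, b1, sigma1, V, b2, sigma2, z):
        h = f_C(W @ z + b1, sigma1)  # hidden output vector
        o = f_C(V @ h + b2, sigma2)  # output vector of the output layer
        return h, o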

The parameter vector ψ, formed by parameters including connection weights between neurons, biases of neurons, and gain coefficients of activation functions in the feedforward complex-valued neural network, is then initialized.

Step 3: Introduce a Group Lasso regularization term to construct a target function for training of the feedforward complex-valued neural network, and convert the training of the feedforward complex-valued neural network into solving a constrained optimization problem. Iterative training is performed by using the training set and a specialized complex-valued projected quasi-Newton algorithm, so that redundant input neurons and hidden neurons are deleted, thereby implementing optimization of both a network structure and parameter vectors.

As shown in FIG. 2 and FIG. 3, a specific process of step 3 includes the following steps:

Step 3.1: Introduce a Group Lasso regularization term R_(GL) into a traditional mean square error function E′ to obtain a target function E for the training.

The Group Lasso regularization term is $R_{GL} = \lambda\sum_{a = 1}^{A}\sqrt{\left| \psi_{a} \right|}\left\| \psi_{a} \right\|$, and the target function E for the training of the feedforward complex-valued neural network is:

${{\min\limits_{\psi}E} = {{E^{\prime} + R_{GL}} = {{\frac{1}{2J}{\sum_{j = 1}^{J}{\left( {o_{j} - y_{j}} \right)^{H}\left( {o_{j} - y_{j}} \right)}}} + {\lambda{\sum_{a = 1}^{A}{\sqrt{\left| \psi_{a} \right|}\left\| \psi_{a} \right\|}}}}}},$ where $E^{\prime} = {\frac{1}{2J}{\sum_{j = 1}^{J}{\left( {o_{j} - y_{j}} \right)^{H}\left( {o_{j} - y_{j}} \right)}}}$

is the traditional mean square error function, J is a total number of training samples, o_(j) represents an actual output of a j^(th) training sample, y_(j) represents a desired output of the j^(th) training sample, and the superscript H represents the conjugate transpose; λ is a regularization coefficient, a=1,2, . . . ,A, A=P+N+2 represents a total number of neurons that may be penalized, that is, P input neurons, N hidden neurons, and 2 bias nodes; and |⋅| represents the dimension of a vector, ∥⋅∥ is the Euclidean norm, and ψ_(a) represents a vector formed by connection weights between an a^(th) neuron and all neurons in a next layer in the feedforward complex-valued neural network.

The Group Lasso regularization term is introduced into the traditional mean square error function to obtain the target function E for the training of the feedforward complex-valued neural network, so that redundant input neurons and hidden neurons can be effectively deleted during the training process, to implement optimization of a network structure and parameter vectors, thereby enhancing the generalization performance of the model.
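A minimal sketch of evaluating E = E′ + R_GL is given below; psi_groups is assumed to be a list holding the real vector ψ_a of each penalizable neuron (the grouping follows the packing order, which is why it is passed here in precomputed form), and the function name target_function is ours.

    import numpy as np

    # Target function E = E' + R_GL for J training samples.
    # outputs, targets: J x Q complex arrays; psi_groups: list of real vectors psi_a.
    def target_function(outputs, targets, psi_groups, lam):
        diff = outputs - targets
        # E' = (1/2J) * sum_j (o_j - y_j)^H (o_j - y_j)
        mse = np.real(np.sum(np.conj(diff) * diff)) / (2 * outputs.shape[0])
        # R_GL = lambda * sum_a sqrt(|psi_a|) * ||psi_a||,
        # where |psi_a| is the group dimension and ||.|| the Euclidean norm.
        r_gl = lam * sum(np.sqrt(g.size) * np.linalg.norm(g) for g in psi_groups)
        return mse + r_gl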

Step 3.2: Convert an unconstrained optimization problem into a constrained optimization problem, that is, introduce a group of artificial variables ρ_(a) to convert an unconstrained optimization problem

${\min\limits_{\psi}E} = {E^{\prime} + {\lambda{\sum_{a = 1}^{A}{\sqrt{\left| \psi_{a} \right|}\left\| \psi_{a} \right\|}}}}$

into a constrained optimization problem:

${{\min\limits_{\bar{\psi}}E} = {E^{\prime} + {\lambda{\sum_{a = 1}^{A}{\sqrt{\left| \psi_{a} \right|}\rho_{a}}}}}},$ s.t. ∥ψ_(a)∥ ≤ ρ_(a), a = 1, 2, …, A;

and define ρ=[ρ₁, ρ₂, . . . , ρ_(A)]^(T) as a real vector formed by the introduced variables ρ_(a), where a parameter vector that needs to be optimized during the training process is ψ̄=[ψ^(T), ρ^(T)]^(T).

Step 3.3: Train the feedforward complex-valued neural network by using the training set and the specialized complex-valued projected quasi-Newton algorithm, calculate an approximate Hessian matrix H^((m)) by using matrices S={s^((i))}_(i=m-τ) ^(i=m-1) and R={r^((i))}_(i=m-τ) ^(i=m-1), and obtain an approximate quadratic optimization problem with a constraint condition, where τ is a constant, representing that a parameter variation s^((i)) and a gradient variation r^((i)) of the latest τ iterations are kept; and s^((i))=ψ̄^((i+1))−ψ̄^((i)), r^((i))=∇E(ψ̄^((i+1)))−∇E(ψ̄^((i))), ψ̄^((i+1)) represents a parameter vector value at an (i+1)^(th) iteration, ψ̄^((i)) represents a parameter vector value at an i^(th) iteration, ∇E(ψ̄^((i))) represents a gradient of the target function E at ψ̄^((i)), ∇E(ψ̄^((i+1))) represents a gradient of the target function E at ψ̄^((i+1)), S represents a matrix formed by parameter variations s^((i)) from the (m−τ)^(th) to (m−1)^(th) iterations, R represents a matrix formed by gradient variations r^((i)) from the (m−τ)^(th) to (m−1)^(th) iterations, and m represents the iteration index.

Step 3.3.1: Use a calculation formula of the approximate Hessian matrix H^((m)):

H ^((m))=σ^((m)) I−NM ⁻¹ N ^(T),

where

${\sigma^{(m)} = \frac{\left( r^{({m - 1})} \right)^{T}r^{({m - 1})}}{\left( r^{({m - 1})} \right)^{T}s^{({m - 1})}}},$

r^((m-1))=∇E(ψ̄^((m)))−∇E(ψ̄^((m-1))), and s^((m-1))=ψ̄^((m))−ψ̄^((m-1)); and I is an identity matrix,

${N = \left\lbrack {\sigma^{(m)}S\quad R} \right\rbrack},{M = \begin{bmatrix}{\sigma^{(m)}S^{T}S} & L \\ L^{T} & {- D}\end{bmatrix}},$

L is a matrix formed by elements

$L_{ij} = \left\{ {\begin{matrix}{\left( s^{({m - \tau - 1 + i})} \right)^{T}\left( r^{({m - \tau - 1 + j})} \right)} & {i > j} \\0 & {i \leq j}\end{matrix},} \right.$

and D=diag[(s^((m-τ)))^(T)(r^((m-τ))), . . . ,(s^((m-1)))^(T)(r^((m-1)))] is a diagonal matrix.
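The compact construction of H^((m)) can be sketched directly from these formulas. S and R are assumed to hold the τ stored variations as columns, and forming H as a dense matrix is done here only for clarity; a practical implementation would apply H to vectors without materializing it.

    import numpy as np

    # Compact approximate Hessian H = sigma*I - N inv(M) N^T from stored S, R.
    # S, R: n x tau matrices whose columns are s^(i) and r^(i).
    def approx_hessian(S, R):
        n = S.shape[0]
        s_last, r_last = S[:, -1], R[:, -1]
        sigma = (r_last @ r_last) / (r_last @ s_last)   # sigma^(m)
        StR = S.T @ R
        L = np.tril(StR, k=-1)               # L_ij = (s^(i))^T r^(j) for i > j
        D = np.diag(np.diag(StR))            # D = diag((s^(i))^T r^(i))
        N = np.hstack([sigma * S, R])        # N = [sigma*S  R]
        M = np.block([[sigma * (S.T @ S), L], [L.T, -D]])
        return sigma * np.eye(n) - N @ np.linalg.solve(M, N.T)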

Step 3.3.2: Obtain the approximate quadratic optimization problem with a constraint condition at the m^(th) iteration by using the approximate Hessian matrix H^((m)):

${\min\limits_{\bar{\psi}}Q} = {{E\left( {\bar{\psi}}^{(m)} \right)} + {\left( {\bar{\psi} - {\bar{\psi}}^{(m)}} \right)^{T}\nabla{E\left( {\bar{\psi}}^{(m)} \right)}} + {\frac{1}{2}\left( {\bar{\psi} - {\bar{\psi}}^{(m)}} \right)^{T}{H^{(m)}\left( {\bar{\psi} - {\bar{\psi}}^{(m)}} \right)}}},$ s.t. ∥ψ_(a)∥ ≤ ρ_(a), a = 1, 2, …, A.

Step 3.4: Solve the approximate quadratic optimization problem with a constraint condition by using a spectral projected gradient algorithm, to obtain a solution ψ̄* of the approximate quadratic optimization problem with a constraint condition.

The approximate quadratic optimization problem with a constraint condition is solved according to the spectral projected gradient algorithm to obtain the solution ψ̄*. The main characteristic of the spectral projected gradient algorithm is that a spectral step size is used as an initial step size, and a learning step size η_(q)^((t)) is determined by using a nonmonotone line search. A specific form of the spectral step size is as follows:

${\eta_{bb}^{(t)} = \frac{\left( {qs}^{({t - 1})} \right)^{T}{qs}^{({t - 1})}}{\left( {qr}^{({t - 1})} \right)^{T}{qs}^{({t - 1})}}},$

where qs^((t-1))=ψ̄^((t))−ψ̄^((t-1)), qr^((t-1))=∇Q(ψ̄^((t)))−∇Q(ψ̄^((t-1))), ∇Q(ψ̄^((t))) represents a gradient of a target function Q of the approximate quadratic optimization problem with a constraint condition at ψ̄^((t)), and t represents the iteration index for optimizing the approximate quadratic optimization problem with a constraint condition by using the spectral projected gradient algorithm, that is, the t^(th) iteration of the spectral projected gradient algorithm.

At the t^(th) iteration, an initial solution ψ̄₊^((t))=ψ̄^((t))−η_(bb)^((t))∇Q(ψ̄^((t))) of the target function Q of the approximate quadratic optimization problem with a constraint condition is first calculated by using a negative gradient direction. However, ψ̄₊^((t)) calculated in this case does not necessarily meet a constraint condition ∥ψ_(a)∥≤ρ_(a), a=1,2, . . . ,A. The following projection operator is therefore used:

${P_{\Omega}\left( {\psi_{a},\rho_{a}} \right)} = \left\{ \begin{matrix}\left( {\psi_{a},\rho_{a}} \right) & {\left\| \psi_{a} \right\| \leq \rho_{a}} \\ \left( {\frac{\psi_{a}}{\left\| \psi_{a} \right\|}\frac{\left\| \psi_{a} \right\| + \rho_{a}}{2},\frac{\left\| \psi_{a} \right\| + \rho_{a}}{2}} \right) & {{\left\| \psi_{a} \right\| > \rho_{a}},{\left\| \psi_{a} \right\| + \rho_{a} > 0}} \\ \left( {0,0} \right) & {{\left\| \psi_{a} \right\| > \rho_{a}},{\left\| \psi_{a} \right\| + \rho_{a} \leq 0}}\end{matrix} \right.$

Parameters of each group of neurons are corrected by using this projection operator to make the parameters meet the constraint condition. For example, for a weight vector ψ₁=ri(w₁) and a parameter ρ₁ of a first input neuron, if ∥ψ₁∥≤ρ₁, the weight parameter of the neuron does not need to be corrected; that is, the first case in the foregoing formula takes place. If ∥ψ₁∥>ρ₁ and ∥ψ₁∥+ρ₁>0, the parameter is corrected to

$\left( {\frac{\psi_{1}}{\left\| \psi_{1} \right\|}\frac{\left\| \psi_{1} \right\| + \rho_{1}}{2},\frac{\left\| \psi_{1} \right\| + \rho_{1}}{2}} \right)$

by using the projection operator to meet the constraint condition. If ∥ψ₁∥>ρ₁ and ∥ψ₁∥+ρ₁≤0, the parameter of the neuron is corrected to (0, 0) by using the projection operator; that is, the third case in the foregoing formula takes place. In this case, a feasible descending direction d_(q)^((t)) for solving the approximate quadratic optimization problem with a constraint condition is obtained by using d_(q)^((t))=ψ̄_(p)^((t))−ψ̄^((t)), where ψ̄_(p)^((t)) is a solution obtained after the foregoing projection operation is performed on each group of neuron parameters in ψ̄₊^((t)).
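The projection of a single group can be sketched as below; it reproduces the three cases just described, and the third branch is exactly the one that zeroes out a redundant neuron's weights.

    import numpy as np

    # Projection of one group (psi_a, rho_a) onto the set ||psi_a|| <= rho_a.
    def project_group(psi_a, rho_a):
        norm = np.linalg.norm(psi_a)
        if norm <= rho_a:                    # case 1: already feasible
            return psi_a, rho_a
        if norm + rho_a > 0:                 # case 2: project onto the boundary
            t = (norm + rho_a) / 2.0
            return (psi_a / norm) * t, t
        return np.zeros_like(psi_a), 0.0     # case 3: delete the group entirely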

Next, the step size η_(q)^((t)) is determined by using a nonmonotone line search. A specific form is as follows:

${{Q\left( {{\overset{\_}{\psi}}^{(t)} + {\eta_{q}^{(t)}d_{q}^{(t)}}} \right)} \leq {{\max\limits_{{t - k} \leq i \leq t}{Q\left( {\overset{\_}{\psi}}^{(i)} \right)}} + {l_{3}\eta_{q}^{(t)}\nabla{Q\left( {\overset{\_}{\psi}}^{(t)} \right)}^{T}d_{q}^{(t)}}}},$

l₃∈(0, 1), the value of k is usually 10, and d_(q)^((t)) represents a search direction of the approximate quadratic optimization problem with a constraint condition at the t^(th) iteration. Iterations are repeatedly performed, and stop when a stop condition of the projected gradient algorithm is reached, to obtain a solution ψ̄* to the quadratic optimization problem with a constraint condition.
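As a sketch, the nonmonotone line search can be written as a backtracking loop that compares against the maximum of the last k objective values; the backtracking factor and the default l₃ are assumptions, while k = 10 follows the text.

    # Nonmonotone line search sketch: Q is a callable quadratic model, g its
    # gradient at psi_t, and history holds the Q-values of the last k iterations.
    def nonmonotone_search(Q, psi_t, d, g, history, l3=1e-4, shrink=0.5, max_tries=30):
        q_ref = max(history)            # max over the last k (k = 10) iterations
        slope = l3 * float(g @ d)       # l3 in (0, 1)
        eta = 1.0
        for _ in range(max_tries):
            if Q(psi_t + eta * d) <= q_ref + eta * slope:
                break
            eta *= shrink               # backtrack (factor is an assumption)
        return eta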

To describe the spectral projected gradient algorithm more intuitively, a specific procedure of the spectral projected gradient algorithm is summarized as follows:

Step 3.4.1: Calculate η_(bb) ^((t)) by using a formula

${\eta_{bb}^{(t)} = \frac{\left( {qs}^{({t - 1})} \right)^{T}{qs}^{({t - 1})}}{\left( {qr}^{({t - 1})} \right)^{T}{qs}^{({t - 1})}}},$

and calculate ψ̄₊^((t)) according to a formula ψ̄₊^((t))=ψ̄^((t))−η_(bb)^((t))∇Q(ψ̄^((t))), where qs^((t-1))=ψ̄^((t))−ψ̄^((t-1)), qr^((t-1))=∇Q(ψ̄^((t)))−∇Q(ψ̄^((t-1))), ∇Q(ψ̄^((t))) represents a gradient of the target function of the approximate quadratic optimization problem with a constraint condition at ψ̄^((t)), and t represents the iteration index for optimizing the approximate quadratic optimization problem with a constraint condition by using the spectral projected gradient algorithm.

Step 3.4.2: Correct parameters of each group of neurons in ψ̄₊^((t)) by using a projection operator

${P_{\Omega}\left( {\psi_{a},\rho_{a}} \right)} = \left\{ \begin{matrix}\left( {\psi_{a},\rho_{a}} \right) & {\left\| \psi_{a} \right\| \leq \rho_{a}} \\ \left( {\frac{\psi_{a}}{\left\| \psi_{a} \right\|}\frac{\left\| \psi_{a} \right\| + \rho_{a}}{2},\frac{\left\| \psi_{a} \right\| + \rho_{a}}{2}} \right) & {{\left\| \psi_{a} \right\| > \rho_{a}},{\left\| \psi_{a} \right\| + \rho_{a} > 0}} \\ \left( {0,0} \right) & {{\left\| \psi_{a} \right\| > \rho_{a}},{\left\| \psi_{a} \right\| + \rho_{a} \leq 0}}\end{matrix} \right.$

to make the parameters meet a constraint condition ∥ψ_(a)∥≤ρ_(a), a=1,2, . . . , A, and calculate ψ̄_(p)^((t)).

Weights of redundant neurons in the feedforward complex-valued neural network are directly reset by using the projection operator, a threshold does not need to be set in advance, and redundant neurons can be directly deleted without causing any impact on a final output of a model; that is, the optimized selection of both a network structure and parameter vectors can be implemented, to make the structure of the feedforward complex-valued neural network compact.

Step 3.4.3: Obtain, according to a formula d_(q)^((t))=ψ̄_(p)^((t))−ψ̄^((t)), a search direction d_(q)^((t)) of solving the approximate quadratic optimization problem with a constraint condition at the t^(th) iteration.

Step 3.4.4: Calculate a learning step size η_(q)^((t)) in the search direction d_(q)^((t)) by using a nonmonotone line search:

${{Q\left( {{\overset{\_}{\psi}}^{(t)} + {\eta_{q}^{(t)}d_{q}^{(t)}}} \right)} \leq {{\max\limits_{{t - k} \leq i \leq t}{Q\left( {\overset{\_}{\psi}}^{(i)} \right)}} + {l_{3}\eta_{q}^{(t)}\nabla{Q\left( {\overset{\_}{\psi}}^{(t)} \right)}^{T}d_{q}^{(t)}}}},{l_{3} \in {\left( {0,1} \right).}}$

Step 3.4.5: Update the parameters according to a formula ψ̄^((t+1))=ψ̄^((t))+η_(q)^((t))d_(q)^((t)), and determine whether the quantity of times of evaluation of the target function of the approximate quadratic optimization problem with a constraint condition is greater than a preset constant T_(e). The value of T_(e) in this embodiment is 10.

If not, the process returns to step 3.4.1; if yes, the algorithm stops, to obtain a solution ψ̄* of the approximate quadratic optimization problem with a constraint condition.

Step 3.5: Calculate a feasible descending direction d^((m)) of the original constrained optimization problem

${{\min\limits_{\bar{\psi}}E} = {E^{\prime} + {\lambda{\sum_{a = 1}^{A}{\sqrt{\left| \psi_{a} \right|}\rho_{a}}}}}},$ s.t. ∥ψ_(a)∥ ≤ ρ_(a), a = 1, 2, …, A

at the m^(th) iteration by using the solution ψ̄* of the approximate quadratic optimization problem with a constraint condition, and use Armijo line search to determine an appropriate learning step size η^((m)).

Step 3.5.1: At the m^(th) iteration of the specialized complex-valued projected quasi-Newton algorithm, first calculate a solution ψ̄* of the quadratic optimization problem with a constraint condition by using the spectral projected gradient algorithm, and then obtain, according to a formula d^((m))=ψ̄*−ψ̄^((m)), the feasible descending direction d^((m)) of the original constrained optimization problem

${{\min\limits_{\bar{\psi}}E} = {E^{\prime} + {\lambda{\sum_{a = 1}^{A}{\sqrt{\left| \psi_{a} \right|}\rho_{a}}}}}},$ s.t. ∥ψ_(a)∥ ≤ ρ_(a), a = 1, 2, …, A

at the m^(th) iteration.

Step 3.5.2: Determine an appropriate learning step size η^((m)) by using Armijo line search at the m^(th) iteration of the specialized complex-valued projected quasi-Newton algorithm:

E(ψ̄^((m))+η^((m))d^((m))) ≤ E(ψ̄^((m)))+l₄η^((m))∇E(ψ̄^((m)))^(T)d^((m)),

where l₄∈(0, 1), d^((m)) represents a feasible descending direction of the original constrained optimization problem

${{\min\limits_{\bar{\psi}}E} = {E^{\prime} + {\lambda{\sum_{a = 1}^{A}{\sqrt{\left| \psi_{a} \right|}\rho_{a}}}}}},$ s.t. ∥ψ_(a)∥ ≤ ρ_(a), a = 1, 2, …, A

at the m^(th) iteration, and ∇E(ψ̄^((m))) represents a gradient of the target function E at ψ̄^((m)).
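A corresponding sketch of the Armijo backtracking search for the outer step size η^((m)) follows; the initial trial step, backtracking factor, and default l₄ are illustrative assumptions.

    # Armijo backtracking sketch: E is the regularized target function,
    # g its gradient at psi_m, d the feasible descending direction d^(m).
    def armijo_search(E, psi_m, d, g, l4=1e-4, shrink=0.5, max_tries=30):
        e0 = E(psi_m)
        slope = l4 * float(g @ d)       # l4 in (0, 1)
        eta = 1.0
        for _ in range(max_tries):
            if E(psi_m + eta * d) <= e0 + eta * slope:
                break
            eta *= shrink
        return eta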

Step 3.6: Update ψ̄ by using d^((m)) and η^((m)), and update the matrices S and R: update, according to a formula ψ̄^((m+1))=ψ̄^((m))+η^((m))d^((m)), the parameter vector ψ̄ that needs to be optimized in the feedforward complex-valued neural network; and calculate s^((m))=ψ̄^((m+1))−ψ̄^((m)) and r^((m))=∇E(ψ̄^((m+1)))−∇E(ψ̄^((m))), store the information of s^((m)) and r^((m)) in the matrices S and R, and discard the information about s^((m-τ)) and r^((m-τ)) of the (m−τ)^(th) iteration from the matrices S and R, to implement the update of S and R. The value of τ in this embodiment is 10.
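The limited-memory bookkeeping amounts to a bounded queue, as in this sketch: appending the m^(th) pair automatically discards the (m−τ)^(th) one. The variable names are ours.

    from collections import deque

    tau = 10                                 # number of stored pairs, per the embodiment
    S_mem, R_mem = deque(maxlen=tau), deque(maxlen=tau)

    def update_memory(psi_new, psi_old, grad_new, grad_old):
        S_mem.append(psi_new - psi_old)      # s^(m) = psi^(m+1) - psi^(m)
        R_mem.append(grad_new - grad_old)    # r^(m) = grad E^(m+1) - grad E^(m)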

Step 3.7: Repeat the specialized complex-valued projected quasi-Newton algorithm in step 3.3 to step 3.6 to perform iterative training, and stop the training of the feedforward complex-valued neural network when a preset iteration termination condition is met, to complete the iterative training of ψ̄. The preset iteration termination condition is specifically: the quantity of times of evaluation of the target function during the training of the feedforward complex-valued neural network reaches a preset maximum quantity of times of evaluation, or a variation between the values of the target function in two consecutive iterations is less than a preset threshold, or a maximum variation of an adjustable parameter in ψ̄ is less than a preset threshold (that is, a stagnated state is entered). The training is stopped if the iteration termination condition is met, and the trained feedforward complex-valued neural network is used for wind speed prediction. The process turns to step 3.3 to continue to train the feedforward complex-valued neural network if the iteration termination condition is not met.
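The three stopping tests can be combined as in the following sketch; the threshold values and counter names are illustrative.

    import numpy as np

    # Stop when the evaluation budget is reached, the target function stagnates,
    # or the largest change of any adjustable parameter falls below a threshold.
    def should_stop(n_evals, max_evals, e_prev, e_curr, psi_prev, psi_curr,
                    tol_e=1e-6, tol_psi=1e-6):
        return (n_evals >= max_evals
                or abs(e_curr - e_prev) < tol_e
                or np.max(np.abs(psi_curr - psi_prev)) < tol_psi)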

Step 4: Obtain the trained feedforward complex-valued neural network, and input the prediction data into the trained feedforward complex-valued neural network to obtain a wind speed prediction result. A desired output is a complex number y formed by a wind speed and a wind direction.

To further describe the beneficial effects of the present invention, 2000 samples are selected for training a feedforward complex-valued neural network containing 20 hidden neurons in this embodiment, and 100 other samples are used to test the performance of the feedforward complex-valued neural network. The present invention (named SC_PQN for convenience) is compared with a split gradient descent training method (SCBPG) and a fully complex gradient descent training method (FCBPG). The training and test errors are presented in Table 1.

TABLE 1: Comparison of training and test errors of SC_PQN, SCBPG, and FCBPG

                                              SCBPG     FCBPG     SC_PQN
    Average training error                    0.0663    0.0904    0.0656
    Average test error                        0.1246    0.1605    0.0840
    Quantity of deleted hidden-layer neurons  0         0         14
    Quantity of deleted input neurons         0         0         2

As can be seen from Table 1, when the present invention uses the specialized complex-valued projected quasi-Newton algorithm, both the average training error and the average test error are minimal, and the optimal training and prediction effects are obtained. In addition, unlike the other training methods, the present invention deletes redundant hidden neurons and input neurons. After these redundant neurons are deleted, the network structure is optimized to obtain a feedforward complex-valued neural network with a more compact structure, thereby enhancing the generalization performance of the model.

Compared with the prior art, the foregoing technical solution of the present invention has the following advantages. For the wind speed prediction method based on a feedforward complex-valued neural network in the present invention: (1) During a training process, a Group Lasso regularization term is introduced into a traditional mean square error function to obtain a target function E for the training of a feedforward complex-valued neural network, so that redundant input neurons and hidden neurons can be effectively deleted during training, to implement optimization of a network structure and parameter vectors. (2) A feedforward complex-valued neural network is trained by using a specialized complex-valued projected quasi-Newton algorithm, and gain coefficients of activation functions may be optimized together as adjustable parameters, so that the adverse impact of the activation function falling in a saturated area during a training process is overcome. In addition, weights of redundant neurons in the feedforward complex-valued neural network are directly reset, where a threshold does not need to be set in advance, and redundant neurons can be directly deleted without causing any impact on a final output of a model; that is, the optimization of both a network structure and parameter vectors can be implemented. (3) A Group Lasso regularization term is introduced and a feedforward complex-valued neural network is trained by using a specialized complex-valued projected quasi-Newton algorithm, and optimization of a network structure and parameters is implemented, so that the structure of the feedforward complex-valued neural network is compact, the generalization performance of a network model is enhanced, and the feedforward complex-valued neural network has smaller errors for wind speed prediction, thereby improving the accuracy of prediction.

A person skilled in the art should understand that the embodiments of the present application may be provided as a method, a system or a computer program product. Therefore, the present application may use a form of hardware only embodiments, software only embodiments, or embodiments with a combination of software and hardware. Moreover, the present application may use a form of a computer program product that is implemented on one or more computer-usable storage media (including but not limited to a disk memory, a CD-ROM, an optical memory, and the like) that include computer usable program code.

The present application is described with reference to the flowcharts and/or block diagrams of the method, the device (system), and the computer program product according to the embodiments of the present application. It should be understood that computer program instructions may be used to implement each process and/or each block in the flowcharts and/or the block diagrams and a combination of a process and/or a block in the flowcharts and/or the block diagrams. These computer program instructions may be provided for a general-purpose computer, a dedicated computer, an embedded processor, or a processor of any other programmable data processing device to generate a machine, so that the instructions executed by a computer or a processor of any other programmable data processing device generate an apparatus for implementing a specific function in one or more processes in the flowcharts and/or in one or more blocks in the block diagrams.

These computer program instructions may be stored in a computer readable memory that can instruct the computer or any other programmable data processing device to work in a specific manner, so that the instructions stored in the computer readable memory generate an artifact that includes an instruction apparatus. The instruction apparatus implements a specific function in one or more processes in the flowcharts and/or in one or more blocks in the block diagrams.

These computer program instructions may be loaded onto a computer or another programmable data processing device, so that a series of operations and steps are performed on the computer or the other programmable device, thereby generating computer-implemented processing. Therefore, the instructions executed on the computer or the other programmable device provide steps for implementing a specific function in one or more processes in the flowcharts and/or in one or more blocks in the block diagrams.

Obviously, the foregoing embodiments are merely examples for clear description, rather than a limitation to implementations. For a person of ordinary skill in the art, other changes or variations in different forms may also be made based on the foregoing description. All implementations cannot and do not need to be exhaustively listed herein. Obvious changes or variations derived therefrom still fall within the protection scope of the present invention.

1. A wind speed prediction method based on a feedforward complex-valued neural network, comprising steps of: step 1: acquiring data for wind speed prediction, arranging the data as a data set, and dividing the data set into a training set and a prediction set; step 2: constructing a feedforward complex-valued neural network, and initializing a parameter vector ψ, wherein the parameter vector ψ is formed by adjustable parameters comprising connection weights between neurons, biases of hidden and output neurons, and gain coefficients of activation functions; step 3: introducing a Group Lasso regularization term to construct a target function for training the feedforward complex-valued neural network, and converting the training of the feedforward complex-valued neural network into solving a constrained optimization problem; and training the feedforward complex-valued neural network by using the training data and a specialized complex-valued projected quasi-Newton algorithm, and stopping the training when a preset iteration termination condition is met; and step 4: obtaining the trained feedforward complex-valued neural network, and inputting the prediction set into the trained feedforward complex-valued neural network to obtain a wind speed prediction result.
2. The wind speed prediction method based on a feedforward complex-valued neural network according to claim 1, wherein in step 2, the constructed feedforward complex-valued neural network comprises P input neurons, N hidden-layer neurons, and Q output neurons, the parameter vector ψ in the feedforward complex-valued neural network is a column vector, and all the adjustable parameters are arranged in an order to obtain the parameter vector ψ: $\psi = \left\lbrack {\left( w_{1}^{R} \right)^{T},\left( w_{1}^{I} \right)^{T},\ldots,\left( w_{P}^{R} \right)^{T},\left( w_{P}^{I} \right)^{T},\left( b_{1}^{R} \right)^{T},\left( b_{1}^{I} \right)^{T},\left( \sigma_{1}^{R} \right)^{T},\left( \sigma_{1}^{I} \right)^{T},\left( v_{1}^{R} \right)^{T},\left( v_{1}^{I} \right)^{T},\ldots,\left( v_{N}^{R} \right)^{T},\left( v_{N}^{I} \right)^{T},\left( b_{2}^{R} \right)^{T},\left( b_{2}^{I} \right)^{T},\left( \sigma_{2}^{R} \right)^{T},\left( \sigma_{2}^{I} \right)^{T}} \right\rbrack^{T} = \left\lbrack {\left( {{ri}\left( w_{1} \right)} \right)^{T},\ldots,\left( {{ri}\left( w_{P} \right)} \right)^{T},\left( {{ri}\left( b_{1} \right)} \right)^{T},\left( \sigma_{1}^{R} \right)^{T},\left( \sigma_{1}^{I} \right)^{T},\left( {{ri}\left( v_{1} \right)} \right)^{T},\ldots,\left( {{ri}\left( v_{N} \right)} \right)^{T},\left( {{ri}\left( b_{2} \right)} \right)^{T},\left( \sigma_{2}^{R} \right)^{T},\left( \sigma_{2}^{I} \right)^{T}} \right\rbrack^{T}$ wherein w_(p) represents a complex vector formed by connection weights between a p^(th) input neuron and the hidden neurons, b₁ represents a complex vector formed by biases of all the hidden neurons, σ₁ represents a complex vector formed by gain coefficients of activation functions of the hidden neurons, v_(n) represents a complex vector formed by connection weights between an n^(th) hidden-layer neuron and the output neurons, b₂ is a complex vector formed by biases of all the output neurons, σ₂ represents a complex vector formed by gain coefficients of activation functions of the output neurons, and the superscript T represents transpose; and the superscript R represents a vector formed by real parts of the corresponding complex vector, the superscript I represents a vector formed by imaginary parts of the corresponding complex vector, and ${{ri}( \bullet )} = \begin{pmatrix}( \bullet )^{R} \\( \bullet )^{I}\end{pmatrix};$ and a hidden output vector of the feedforward complex-valued neural network is h_(j)=f_(C)(Wz_(j)+b₁), and an output vector of the output layer is o_(j)=f_(C)(Vh_(j)+b₂), wherein f_(C)(⋅) represents an activation function, W=[w₁, w₂, . . . , w_(P)] is a weight matrix between the input and hidden neurons, and z_(j) is a j^(th) input sample of the feedforward complex-valued neural network; and V=[v₁, v₂, . . . , v_(N)] is a weight matrix between the hidden and output neurons.
3. The wind speed prediction method based on a feedforward complex-valued neural network according to claim 1, wherein the step 3 further comprises: step 3.1: introducing a Group Lasso regularization term R_(GL) into a traditional mean square error function E′ to obtain a target function E during the training of the feedforward complex-valued neural network; step 3.2: introducing a group of artificial variables ρ_(a) to convert an unconstrained optimization problem ${\min\limits_{\psi}E} = {E^{\prime} + {\lambda{\sum}_{a = 1}^{A}\sqrt{\left| \psi_{a} \right|}\left\| \psi_{a} \right\|}}$ into a constrained optimization problem ${{\min\limits_{\bar{\psi}}E} = {E^{\prime} + {\lambda{\sum}_{a = 1}^{A}\sqrt{\left| \psi_{a} \right|}\rho_{a}}}},$ s.t. ∥ψ_(a)∥ ≤ ρ_(a), a = 1, 2, …, A; and defining ρ=[ρ₁, ρ₂, . . . , ρ_(A)]^(T) as a real vector formed by the introduced variables ρ_(a), wherein a parameter vector that needs to be optimized during the training process in this case is ψ̄=[ψ^(T), ρ^(T)]^(T); step 3.3: calculating an approximate Hessian matrix H^((m)) by using matrices S={s^((i))}_(i=m-τ) ^(i=m-1) and R={r^((i))}_(i=m-τ) ^(i=m-1), and obtaining an approximate quadratic optimization problem with a constraint condition, wherein τ is a constant, representing that a parameter variation s^((i)) and a gradient variation r^((i)) of the latest τ iterations are kept; and s^((i))=ψ̄^((i+1))−ψ̄^((i)), r^((i))=∇E(ψ̄^((i+1)))−∇E(ψ̄^((i))), ψ̄^((i+1)) represents a parameter vector value of ψ̄ at the (i+1)^(th) iteration, ψ̄^((i)) represents a parameter vector value of ψ̄ at the i^(th) iteration, ∇E(ψ̄^((i))) represents a gradient of the target function E at ψ̄^((i)), ∇E(ψ̄^((i+1))) represents a gradient of the target function E at ψ̄^((i+1)), S represents a matrix formed by parameter variations s^((i)) from the (m−τ)^(th) to (m−1)^(th) iterations, R represents a matrix formed by gradient variations r^((i)) from the (m−τ)^(th) to (m−1)^(th) iterations, and m represents an iteration number; step 3.4: solving the approximate quadratic optimization problem with a constraint condition by using a spectral projected gradient algorithm, to obtain a solution ψ̄* of the approximate quadratic optimization problem with a constraint condition; step 3.5: calculating a feasible descending direction d^((m)) of the original constrained optimization problem ${{\min\limits_{\bar{\psi}}E} = {E^{\prime} + {\lambda{\sum}_{a = 1}^{A}\sqrt{\left| \psi_{a} \right|}\rho_{a}}}},$ s.t. ∥ψ_(a)∥ ≤ ρ_(a), a = 1, 2, …, A at the m^(th) iteration by using the solution ψ̄* of the approximate quadratic optimization problem with a constraint condition, and using Armijo line search to determine an appropriate learning step size η^((m)); step 3.6: updating ψ̄ by using d^((m)) and η^((m)), and updating the matrices S and R; step 3.7: repeating step 3.3 to step 3.6, and stopping the training of the feedforward complex-valued neural network when the preset iteration termination condition is met.
4. The wind speed prediction method based on a feedforward complex-valued neural network according to claim 3, wherein the Group Lasso regularization term in step 3.1 is $R_{GL} = \lambda \sum_{a=1}^{A} \sqrt{\left| \psi_{a} \right|}\, \left\| \psi_{a} \right\|$, and the target function E for the training of the feedforward complex-valued neural network is:

$$\min_{\psi} E = E' + R_{GL} = \frac{1}{2J} \sum_{j=1}^{J} \left( o_{j} - y_{j} \right)^{H} \left( o_{j} - y_{j} \right) + \lambda \sum_{a=1}^{A} \sqrt{\left| \psi_{a} \right|}\, \left\| \psi_{a} \right\|,$$

wherein $E' = \frac{1}{2J} \sum_{j=1}^{J} \left( o_{j} - y_{j} \right)^{H} \left( o_{j} - y_{j} \right)$ is the traditional mean square error function, J is a total number of training samples, o_j represents an actual output of a j^(th) training sample, y_j represents a desired output of the j^(th) training sample, and the superscript H represents the conjugate transpose; λ is a regularization coefficient, a = 1, 2, . . . , A, and A = P + N + 2 represents a total number of neurons that may be penalized, that is, P input neurons, N hidden neurons, and 2 bias nodes; and |⋅| represents the dimension of a vector, ∥⋅∥ is the Euclidean norm, and ψ_a represents a vector formed by connection weights between an a^(th) neuron and all neurons in a next layer in the feedforward complex-valued neural network.
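A small sketch of the target function in this claim, combining the complex-valued mean square error E′ with the Group Lasso term, is given below; the toy data and the value of λ are assumptions for the example.

```python
# Sketch of the target function E = E' + R_GL from claim 4.
import numpy as np

def target_function(outputs, targets, groups, lam):
    """E = (1/2J) sum_j (o_j - y_j)^H (o_j - y_j) + lam * sum_a sqrt(|psi_a|) ||psi_a||."""
    J = len(outputs)
    E_prime = sum(((o - y).conj() @ (o - y)).real for o, y in zip(outputs, targets)) / (2 * J)
    R_GL = lam * sum(np.sqrt(g.size) * np.linalg.norm(g) for g in groups)
    return E_prime + R_GL

# Toy usage with two complex samples and two parameter groups:
rng = np.random.default_rng(2)
outs = [rng.standard_normal(1) + 1j * rng.standard_normal(1) for _ in range(2)]
ys   = [rng.standard_normal(1) + 1j * rng.standard_normal(1) for _ in range(2)]
grps = [rng.standard_normal(4), rng.standard_normal(4)]
print(target_function(outs, ys, grps, lam=1e-3))
```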
5. The wind speed prediction method based on a feedforward complex-valued neural network according to claim 3, wherein the step 3.3 further comprises: step 3.3.1: using a calculation formula of the approximate Hessian matrix H^((m)): $H^{(m)} = \sigma^{(m)} I - N M^{-1} N^{T}$, wherein

$$\sigma^{(m)} = \frac{\left( r^{(m-1)} \right)^{T} r^{(m-1)}}{\left( r^{(m-1)} \right)^{T} s^{(m-1)}},$$

r^((m−1)) = ∇E(ψ̄^((m))) − ∇E(ψ̄^((m−1))), and s^((m−1)) = ψ̄^((m)) − ψ̄^((m−1)); N = [σ^((m))S R],

$$M = \begin{bmatrix} \sigma^{(m)} S^{T} S & L \\ L^{T} & -D \end{bmatrix},$$

L is a matrix formed by elements

$$L_{ij} = \begin{cases} \left( s^{(m-\tau-1+i)} \right)^{T} r^{(m-\tau-1+j)} & i > j \\ 0 & i \leq j \end{cases},$$

I is an identity matrix, and D = diag[(s^((m−τ)))^T r^((m−τ)), . . . , (s^((m−1)))^T r^((m−1))] is a diagonal matrix; and step 3.3.2: obtaining the approximate quadratic optimization problem with a constraint condition at the m^(th) iteration by using the approximate Hessian matrix H^((m)):

$$\min_{\bar{\psi}} Q = E\left( \bar{\psi}^{(m)} \right) + \left( \bar{\psi} - \bar{\psi}^{(m)} \right)^{T} \nabla E\left( \bar{\psi}^{(m)} \right) + \frac{1}{2} \left( \bar{\psi} - \bar{\psi}^{(m)} \right)^{T} H^{(m)} \left( \bar{\psi} - \bar{\psi}^{(m)} \right), \quad \text{s.t.}\ \left\| \psi_{a} \right\| \leq \rho_{a},\ a = 1, 2, \ldots, A.$$
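The formula in step 3.3.1 is the standard compact limited-memory quasi-Newton representation; the sketch below assembles H^((m)) for a toy quadratic. The problem size, the history length, and the synthetic S and R matrices are assumptions for the example.

```python
# Sketch of step 3.3.1: H^(m) = sigma^(m) I - N M^{-1} N^T with N = [sigma*S  R].
import numpy as np

def approx_hessian(S, R):
    """S, R: n x tau matrices whose columns are the last tau s^(i), r^(i)."""
    s_last, r_last = S[:, -1], R[:, -1]
    sigma = (r_last @ r_last) / (r_last @ s_last)   # sigma^(m)
    StR = S.T @ R
    L = np.tril(StR, k=-1)                          # L_ij = (s^(i))^T r^(j) for i > j
    D = np.diag(np.diag(StR))                       # diag[(s^(i))^T r^(i)]
    N = np.hstack([sigma * S, R])
    M = np.block([[sigma * (S.T @ S), L],
                  [L.T,              -D]])
    return sigma * np.eye(S.shape[0]) - N @ np.linalg.solve(M, N.T)

# Toy histories from a quadratic with SPD Hessian A (so curvature s^T r > 0):
rng = np.random.default_rng(3)
A = np.diag(rng.uniform(1.0, 3.0, 6))
S = rng.standard_normal((6, 3))
H = approx_hessian(S, A @ S)     # r^(i) = A s^(i) for a quadratic objective
print(np.allclose(H, H.T))       # the approximation is symmetric
```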
6. The wind speed prediction method based on a feedforward complex-valued neural network according to claim 3, wherein the step 3.4 comprises: step 3.4.1: calculating η_bb^((t)) by using a formula

$$\eta_{bb}^{(t)} = \frac{\left( qs^{(t-1)} \right)^{T} qs^{(t-1)}}{\left( qr^{(t-1)} \right)^{T} qs^{(t-1)}},$$

and calculating ψ̄₊^((t)) according to a formula ψ̄₊^((t)) = ψ̄^((t)) − η_bb^((t)) ∇Q(ψ̄^((t))), wherein qs^((t−1)) = ψ̄^((t)) − ψ̄^((t−1)), qr^((t−1)) = ∇Q(ψ̄^((t))) − ∇Q(ψ̄^((t−1))), ∇Q(ψ̄^((t))) represents a gradient of the target function of the approximate quadratic optimization problem with a constraint condition at ψ̄^((t)), and t represents an iteration number for optimizing the approximate quadratic optimization problem with a constraint condition by using the spectral projected gradient algorithm; step 3.4.2: correcting parameters of each group of neurons in ψ̄₊^((t)) by using a projection operator

$$P_{\Omega}\left( \psi_{a}, \rho_{a} \right) = \begin{cases} \left( \psi_{a}, \rho_{a} \right) & \left\| \psi_{a} \right\| \leq \rho_{a} \\ \left( \dfrac{\psi_{a}}{\left\| \psi_{a} \right\|} \cdot \dfrac{\left\| \psi_{a} \right\| + \rho_{a}}{2},\ \dfrac{\left\| \psi_{a} \right\| + \rho_{a}}{2} \right) & \left\| \psi_{a} \right\| > \rho_{a},\ \left\| \psi_{a} \right\| + \rho_{a} > 0 \\ \left( 0, 0 \right) & \left\| \psi_{a} \right\| > \rho_{a},\ \left\| \psi_{a} \right\| + \rho_{a} \leq 0 \end{cases}$$

to make the parameters meet a constraint condition ∥ψ_a∥ ≤ ρ_a, a = 1, 2, . . . , A, and calculating ψ̄_p^((t)); step 3.4.3: obtaining, according to a formula d_q^((t)) = ψ̄_p^((t)) − ψ̄^((t)), a search direction d_q^((t)) of solving the approximate quadratic optimization problem with a constraint condition at the t^(th) iteration; step 3.4.4: obtaining a learning step size η_q^((t)) of the search direction d_q^((t)) by using a nonmonotone line search:

$$Q\left( \bar{\psi}^{(t)} + \eta_{q}^{(t)} d_{q}^{(t)} \right) \leq \max_{t-k \leq i \leq t} Q\left( \bar{\psi}^{(i)} \right) + l_{3} \eta_{q}^{(t)} \nabla Q\left( \bar{\psi}^{(t)} \right)^{T} d_{q}^{(t)}, \quad l_{3} \in \left( 0, 1 \right);$$

and step 3.4.5: updating the parameters according to a formula ψ̄^((t+1)) = ψ̄^((t)) + η_q^((t)) d_q^((t)), and determining whether a quantity of times of evaluation of the target function of the approximate quadratic optimization problem with a constraint condition is greater than a preset constant T_e, wherein if not, the process returns to step 3.4.1, or if yes, the algorithm stops, to obtain a solution ψ̄* of the approximate quadratic optimization problem with a constraint condition.
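The projection in step 3.4.2 is the Euclidean projection of each group onto the cone {(ψ_a, ρ_a) : ∥ψ_a∥ ≤ ρ_a}; a direct transcription follows, together with the Barzilai–Borwein step length of step 3.4.1 on toy vectors. The example vectors are assumptions.

```python
# Sketch of steps 3.4.1-3.4.2: BB step length and groupwise cone projection.
import numpy as np

def project_group(psi_a, rho_a):
    """P_Omega from step 3.4.2: project (psi_a, rho_a) onto ||psi_a|| <= rho_a."""
    n = np.linalg.norm(psi_a)
    if n <= rho_a:
        return psi_a, rho_a
    if n + rho_a > 0:
        t = 0.5 * (n + rho_a)
        return (psi_a / n) * t, t
    return np.zeros_like(psi_a), 0.0

# Barzilai-Borwein step length eta_bb from step 3.4.1 (toy vectors):
qs = np.array([0.3, -0.1, 0.2])   # psi_bar^(t) - psi_bar^(t-1)
qr = np.array([0.6, -0.2, 0.5])   # grad Q(psi_bar^(t)) - grad Q(psi_bar^(t-1))
eta_bb = (qs @ qs) / (qr @ qs)

print(eta_bb, project_group(np.array([3.0, 4.0]), 2.0))  # ||psi_a|| = 5 > rho_a = 2
```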
7. The wind speed prediction method based on a feedforward complex-valued neural network according to claim 3, wherein a calculation method of the feasible descending direction d^((m)) in step 3.5 comprises: at the m^(th) iteration of the specialized complex-valued projected quasi-Newton algorithm, first calculating a solution ψ̄* of the approximate quadratic optimization problem with a constraint condition by using the spectral projected gradient algorithm, and then obtaining the feasible descending direction d^((m)) of the original constrained optimization problem

$$\min_{\bar{\psi}} E = E' + \lambda \sum_{a=1}^{A} \sqrt{\left| \psi_{a} \right|}\, \rho_{a}, \quad \text{s.t.}\ \left\| \psi_{a} \right\| \leq \rho_{a},\ a = 1, 2, \ldots, A$$

at the m^(th) iteration according to a formula d^((m)) = ψ̄* − ψ̄^((m)).

8. The wind speed prediction method based on a feedforward complex-valued neural network according to claim 3, wherein the using Armijo line search to determine an appropriate learning step size η^((m)) in step 3.5 comprises: determining an appropriate learning step size η^((m)) by using Armijo line search at the m^(th) iteration of the specialized complex-valued projected quasi-Newton algorithm:

$$E\left( \bar{\psi}^{(m)} + \eta^{(m)} d^{(m)} \right) \leq E\left( \bar{\psi}^{(m)} \right) + l_{4} \eta^{(m)} \nabla E\left( \bar{\psi}^{(m)} \right)^{T} d^{(m)},$$

wherein l₄ ∈ (0, 1), d^((m)) represents a feasible descending direction of the original constrained optimization problem

$$\min_{\bar{\psi}} E = E' + \lambda \sum_{a=1}^{A} \sqrt{\left| \psi_{a} \right|}\, \rho_{a}, \quad \text{s.t.}\ \left\| \psi_{a} \right\| \leq \rho_{a},\ a = 1, 2, \ldots, A$$

at the m^(th) iteration, and ∇E(ψ̄^((m))) represents a gradient of the target function E at ψ̄^((m)).
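The Armijo condition of claim 8 can be satisfied by simple backtracking; the sketch below assumes an initial step of 1, a halving factor, and l₄ = 10⁻⁴, none of which are fixed by the claim.

```python
# Sketch of claim 8: backtracking Armijo search along a feasible direction d.
import numpy as np

def armijo_step(E, grad_E, psi_bar, d, l4=1e-4, eta=1.0, shrink=0.5, max_backtracks=50):
    """Return eta with E(psi + eta*d) <= E(psi) + l4 * eta * grad_E(psi)^T d."""
    E0, slope = E(psi_bar), grad_E(psi_bar) @ d
    for _ in range(max_backtracks):
        if E(psi_bar + eta * d) <= E0 + l4 * eta * slope:
            break
        eta *= shrink
    return eta

# Toy usage on E(x) = ||x||^2 with the descending direction d = -grad:
E = lambda x: x @ x
grad_E = lambda x: 2 * x
x0 = np.array([1.0, -2.0])
print(armijo_step(E, grad_E, x0, -grad_E(x0)))
```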
9. The wind speed prediction method based on a feedforward complex-valued neural network according to claim 3, wherein the updating ψ̄ by using d^((m)) and η^((m)), and updating the matrices S and R in step 3.6 comprise: updating, according to a formula ψ̄^((m+1)) = ψ̄^((m)) + η^((m)) d^((m)), the parameter vector ψ̄ that needs to be optimized in the feedforward complex-valued neural network; calculating s^((m)) = ψ̄^((m+1)) − ψ̄^((m)) and r^((m)) = ∇E(ψ̄^((m+1))) − ∇E(ψ̄^((m))); and storing the information of s^((m)) and r^((m)) into the matrices S and R, and discarding s^((m−τ)) and r^((m−τ)) of the (m−τ)^(th) iteration from the matrices S and R, to implement the update of S and R.
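Step 3.6's fixed-window update of S and R maps naturally onto a bounded double-ended queue; the history length τ below is an illustrative value.

```python
# Sketch of step 3.6: keep only the latest tau pairs (s^(m), r^(m)).
from collections import deque
import numpy as np

tau = 5                                            # history length (illustrative)
S_hist, R_hist = deque(maxlen=tau), deque(maxlen=tau)

def update_memory(psi_new, psi_old, g_new, g_old):
    """Append s^(m), r^(m); deque(maxlen=tau) silently drops the (m - tau)-th pair."""
    S_hist.append(psi_new - psi_old)
    R_hist.append(g_new - g_old)

# Toy usage; columns for the compact Hessian formula come from column_stack:
rng = np.random.default_rng(4)
for _ in range(7):                                 # more updates than tau
    p_old, p_new = rng.standard_normal(3), rng.standard_normal(3)
    update_memory(p_new, p_old, 2 * p_new, 2 * p_old)
S = np.column_stack(S_hist)                        # only the last tau = 5 columns remain
print(S.shape)                                     # (3, 5)
```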
10. The wind speed prediction method based on a feedforward complex-valued neural network according to claim 1, wherein the preset iteration termination condition in step 3 is: a quantity of times of evaluation of the target function during the training of the feedforward complex-valued neural network reaches a preset maximum quantity of times of evaluation, or a variation between the values of the target function in two consecutive iterations is less than a preset threshold, or a maximum variation of an adjustable parameter in ψ is less than a preset threshold.
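The three termination tests of this claim combine with a logical or; in the sketch below, the thresholds and the evaluation budget are placeholder values, as the claim leaves them preset but unspecified.

```python
# Sketch of claim 10's preset iteration termination condition.
import numpy as np

def should_stop(n_evals, E_prev, E_curr, psi_prev, psi_curr,
                max_evals=10000, tol_E=1e-8, tol_psi=1e-8):
    """Stop when the evaluation budget is spent, the target function stalls,
    or no adjustable parameter moves by more than tol_psi."""
    return (n_evals >= max_evals
            or abs(E_curr - E_prev) < tol_E
            or np.max(np.abs(psi_curr - psi_prev)) < tol_psi)

# Toy usage: the target function has stalled, so training stops.
print(should_stop(10, 0.5, 0.5 - 1e-9, np.zeros(3), np.full(3, 1e-9)))
```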